EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks
For training fully-connected neural networks (FCNNs), we propose a practical
approximate second-order method with two components: 1) an approximation of the
Hessian matrix and 2) a conjugate gradient (CG) based method. Our proposed
approximate Hessian matrix is memory-efficient and can be applied to any FCNN
whose activation and criterion functions are twice differentiable. We devise a
CG-based method incorporating a rank-one approximation to derive Newton
directions for training FCNNs, which significantly reduces both space and time
complexity. This CG-based method can be employed to solve any linear system
whose coefficient matrix is Kronecker-factored, symmetric, and positive
definite. Empirical studies show the efficacy and efficiency of our proposed
method.
Comment: Change to AAAI-19 Version
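
To make the Kronecker-factored claim concrete, here is a minimal sketch of plain conjugate gradient applied to a system (A ⊗ B) x = b without ever forming the Kronecker product. It is not the EA-CG algorithm from the paper: the rank-one approximation and the approximate Hessian are omitted, and the names `kron_matvec`, `cg_kron`, and `random_spd` are hypothetical helpers chosen for illustration. The sketch assumes A and B are symmetric positive definite, which makes A ⊗ B symmetric positive definite as well.

```python
import numpy as np


def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without building the Kronecker product.

    With NumPy's row-major reshape, (A kron B) @ X.ravel() equals
    (A @ X @ B.T).ravel() for X of shape (A.shape[0], B.shape[0]).
    """
    n, m = A.shape[0], B.shape[0]
    X = x.reshape(n, m)
    return (A @ X @ B.T).ravel()


def cg_kron(A, B, b, tol=1e-10, max_iter=None):
    """Textbook CG for (A kron B) x = b, assuming A and B are SPD."""
    if max_iter is None:
        max_iter = 10 * b.size
    x = np.zeros_like(b)
    r = b.copy()          # residual b - M x, with x = 0 initially
    p = r.copy()          # first search direction
    rs = r @ r
    stop = tol * np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= stop:
            break
        Mp = kron_matvec(A, B, p)
        alpha = rs / (p @ Mp)
        x += alpha * p
        r -= alpha * Mp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def random_spd(k, rng):
    """Random symmetric positive definite matrix (illustrative test data)."""
    Q = rng.standard_normal((k, k))
    return Q @ Q.T + k * np.eye(k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B = random_spd(4, rng), random_spd(3, rng)
    b = rng.standard_normal(12)
    x = cg_kron(A, B, b)
    # Check against the explicit Kronecker product (small sizes only).
    print(np.allclose(np.kron(A, B) @ x, b))
```

Each matrix-vector product here touches only the two factors, so for factors of size n and m the storage is on the order of n² + m² rather than (nm)², which is the kind of saving a Kronecker-factored coefficient matrix makes possible.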